Spectral control in concatenative speech synthesis
Abstract
We report on research in which we increased the degree of spectral control in concatenative synthesis by controlling the formant frequencies of the synthetic speech, as well as the energies in four spectral bands. In addition, we eliminated “points” of concatenation in favor of “regions” of concatenation, by crossfading between the end and the beginning of two speech segments that are part of a concatenation operation. We hypothesized that these approaches would decrease the frequency and severity of audible discontinuities in the synthetic speech and thus also increase the perceived quality of the speech. A listening test determined that stimuli created with the proposed methods resulted in significantly increased quality.
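As a rough illustration of a "region" of concatenation, the sketch below blends the last few samples of one segment with the first few samples of the next instead of joining them at a single point. The abstract does not say which crossfade window was used or whether the blend was applied to waveforms or to synthesis parameters, so the raised-cosine weighting, the function name `crossfade_concatenate`, and the `overlap` parameter are all assumptions made for this example.

```python
import math


def crossfade_concatenate(left, right, overlap):
    """Join two sample sequences over a region of `overlap` samples.

    Hypothetical helper: the last `overlap` samples of `left` are blended
    with the first `overlap` samples of `right`.
    """
    if overlap < 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must fit inside both segments")
    if overlap == 0:
        # Degenerate case: ordinary point concatenation.
        return list(left) + list(right)

    start = len(left) - overlap
    head = list(left[:start])      # untouched part of the first segment
    tail = list(right[overlap:])   # untouched part of the second segment

    blended = []
    for i in range(overlap):
        # Assumed raised-cosine weight, rising from near 0 to near 1.
        w = 0.5 - 0.5 * math.cos(math.pi * (i + 0.5) / overlap)
        blended.append((1.0 - w) * left[start + i] + w * right[i])

    return head + blended + tail


if __name__ == "__main__":
    a = [1.0] * 8
    b = [0.0] * 8
    print(crossfade_concatenate(a, b, 4))
```

Because the two weights always sum to one, segments that already agree in the overlap region pass through unchanged; only mismatched segments are smoothed. The output is shorter than a point concatenation by `overlap` samples, since the region is shared by both segments.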
Similar resources
Generating emotional speech with a concatenative synthesizer
We describe an attempt to synthesize emotional speech with a concatenative speech synthesizer using a parameter space covering not only f0, duration and amplitude, but also voice quality parameters, spectral energy distribution, harmonics-to-noise ratio, and articulatory precision. The application of this extended parameter set offers the possibility to combine the high segmental quality of c...
Spectral modification for concatenative speech synthesis
Concatenative synthesis can produce high-quality speech but is limited to the allophonic variations and voice types that were captured in the database. It would be desirable to modify speech units to remove formant discontinuities and to create new speaking styles, such as hypo- or hyper-articulated speech. Unfortunately, manipulating the spectral structure often leads to degraded speech quality....
On the Detection of Discontinuities in Concatenative Speech Synthesis
Over the last decade, considerable work has been done on finding an objective distance measure that can predict audible discontinuities in concatenative speech synthesis. Speech segments in concatenative synthesis are extracted from disjoint phonetic contexts, and discontinuities in spectral shape and phase mismatches tend to occur at unit boundaries. Many feature sets, most of them of spectral na...
Spectral Continuity Measures at Mandarin Syllable Boundaries
In Text-to-Speech (TTS) systems based on concatenative synthesis, the naturalness of synthetic speech is strongly affected by spectral continuity at the concatenation point. In this paper, we focus on four kinds of syllable boundaries in Mandarin and use several spectral distance measures, combined with time-derivative distance measures, to predict their audible discontinuities. A perceptual...
Spectral smoothing for concatenative speech synthesis
This paper addresses the topic of performing effective concatenative speech synthesis with a limited database by proposing methods to smooth the transitions between speech segments. The objective is to produce natural-sounding speech via segment concatenation when formants and other spectral features do not align properly. We propose several methods for adjusting the spectra between waveform segm...